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Abstract — For successful estimation, the usual network to- 
mography algorithms crucially require i) end-to-end data 
generated using multicast probe packets, real or emulated, 
and ii) the network to be a tree rooted at a single sender 
with destinations at leaves. These requirements, consequently, 
limit their scope of application. In this paper, we address 
successfully a general problem, henceforth called generalized 
network tomography, wherein the objective is to estimate 
the link performance parameters for networks with arbitrary 
topologies using only end-to-end measurements of pure unicast 
probe packets. Mathematically, given a binary matrix A, we 
propose a novel algorithm to uniquely estimate the distribution 
of X, a vector of independent non-negative random variables, 
given only IID samples of the components of the random vector 
Y — AX. This algorithm, in fact, does not even require any 
prior knowledge of the unknown distributions. The idea is to 
approximate the distribution of each component of X using 
linear combinations of known exponential bases and estimate 
the unknown weights. These weights are obtained by solving 
a set of polynomial systems based on the moment generating 
function of the components of Y. For unique identifiability, 
it is only required that every pair of columns of the matrix 
A be linearly independent, a property that holds true for the 
routing matrices of all multicast tree networks. Matlab based 
simulations have been included to illustrate the potential of the 
proposed scheme. 

I. Introduction 

Network tomography, first proposed in [1], is the science 
of inferring spatially localized network behavior using only 
measurable metrics. The problems considered in network 
tomography can be classified into two broad strands: i) traffic 
demand tomography — determination of source destination 
traffic volumes via measurements of link volumes and ii) 
network delay tomography — link parameter estimation based 
on end to end path level measurements. However, central to 
both these areas, is the problem of inferring the statistics of 
X, a vector of independent non-negative random variables, 
given the measurement model Y = AX. The challenge in 
these problems stems from the fact that A is usually an ill 
posed matrix and hence non-invertible. For excellent tutorials 
and surveys on the state of art, see [2], [3], [4] and [5]. 

For sake of definiteness, we consider here the problem of 
network delay tomography. The proposed method, however, 
is also applicable to traffic demand tomography. Under delay 
tomography, the major problems studied include estimation 
of bottleneck link bandwidths, e.g. [6], [7], link loss rates, 
e.g. [8], link delays, e.g., [9], [10], [11], [12], etc. For suc- 
cessful estimation, the proposed solutions to these problems 
crucially require i) end-to-end data generated using multicast 



probe packets, real or emulated, and ii) the network to be a 
tree rooted at a single sender with destinations at leaves. 
These algorithms mainly exploit the correlations in the path 
measurements, i.e., the packets have the same experience on 
shared links. Because of this, any divergence in either of 
the above requirements results in performance degradation. 
Consequently, there is a need to develop tomography algo- 
rithms for networks with arbitrary topologies using only pure 
unicast probe packet measurements. Mathematically, this is 
same as addressing the generalized network tomography 
problem (GNT), wherein, given the binary matrix A, the 
objective is to estimate the statistics of X, a vector of 
independent non-negative random variables, given only IID 
samples of the components of the random vector Y = AX. 

In this paper, we propose a novel method for the frame- 
work of GNT to accurately estimate the distribution of X 
even when no prior knowledge about the same is available. 
We rely on the fact that the class of generalized hyper- 
exponential (GH) distributions is dense in the set of non- 
negative distributions (see [13]). Using this, the idea then 
is to approximate the distribution of each component of X 
using linear combinations of known exponential bases and 
estimate the unknown weights. These weights are obtained 
by solving a set of polynomial systems based on the moment 
generating function of the components of Y. For unique 
identifiability, it is only required that every pair of columns 
of the matrix A be linearly independent, a property that holds 
true for the routing matrices of all multicast tree networks. 

The rest of the paper is organized as follows. In the next 
section, we develop the notation and formally describe the 
problem. Section|III]recaps the theory of approximating non- 
negative distributions using linear combinations of expo- 
nentials. In Sections IV and [V] we develop our proposed 
method and demonstrate its universal applicability. We give 



numerical examples in Section VI and end with a short 
discussion in Section \VU\ 



II. Model and Problem Description 

Any cumulative distribution function (CDF) that we work 
with is always assumed to be continuous with support 
(0, oo ). The moment generating function (MGF) of the 
random variable X will be Mx(t) = E(exp(— tX)). For 
n e N, we use [n] and S n to represent respectively the 
set {1, ,..,n) and its permutation group. All vectors are 
column vectors and their lengths refer to the usual Euclidean 
norm. For 6 > 0, i?(v; S) represents the open <5 — ball around 



the vector v. To denote the derivative of the map / with 
respect to x, we use /(x). The notation =>■ stands for weak 
convergence of distributions. Lastly, all empty sums and 
empty products equal and 1 respectively. 

Let Xi, . . . , Xn denote the independent non-negative ran- 
dom variables whose distribution we wish to estimate. We 
assume that each X, has a GH distribution of the form 



Fj{u) 



YfktX w jk [1 - exp (-A fe u)] , u>0 (1) 

where A fe > 0, J2tt\ Wjk^k exp(-A fe u) > and 
Sfcii w jk = 1- Further, we suppose that Ai, . . . , A^+i are 
distinct and explicitly known and that the weight vectors of 
distinct random variables differ at least in one component. 
Let A G {0, iy nxN denote an a priori known matrix which 
is 1— identifiable in the following sense. 

Definition 1: A matrix A is fc— identifiable if every set of 
2fc of its columns is linearly independent. 

Let X = (Xx, X N ) and Y = AX. For each i G [to], 
we presume that we have access to a sequence of IID 
samples of the i th linear combination Y, . Our problem then 
is to estimate for each Xj, its vector of weights Wj = 
(Wj%, . . . , Wjd) and consequently its complete distribution 
Fj, since w j{d+1) = 1 - J2k=i w jk- 

Before developing the estimation procedure, we begin by 
making a case for the distribution model of ([TJ. 

III. Approximating Distribution Functions 

Consider the problem of simultaneously estimating the 
members of a finite family of arbitrary distributions, say 
Q = {G , . . . , G N }. A useful strategy is to approximate each 
member by a GH distribution. The CDF of a GH random 
variable X is given by 



Fx(u) = J2tt\ a k [1 - exp (-Afett)] , u > 0, 



(2) 



where Afe > 0, J2k=\ a k^k exp(— Afeit) > and 
a k — 1- Consequently, its MGF is given by 



M x (t) 



d+l 



Afe 
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Xk+t 

k—1 



(3) 



In addition to the simple algebraic form of the above 
quantities, the other major reason to use the GH class is 
that, in the sense of weak topology, it is dense in the set 
of distributions (see [13]). In fact given a continuous CDF 
F with MGF M, one can explicitly find a sequence of 
GH distributions, say F n , that converge uniformly to it. 
Furthermore, if M n is the MGF for F n , then for each t > 0, 
M n (t) — » M(t). The idea is based on the following result. 

Theorem 1: [14] For n, k € N, let X n ^ be a nonnegative 
GH random variable with mean k/n, variance cr^ k and CDF 
W„,fc. Suppose 

1) the function v : N — > N satisfies lim v(n)jn = oo. 



1) the function F n given by 

v(n) 

F n (u) = J2{F(k/n)-F((k-l)/n)}W n , k (u) 
fe=i 

+(1 - F{u(n)/n))W nMn)+1 (u) 

is a GH distribution for every n £ N and 

2) F n F. 

Furthermore, if F n is a sequence of CDFs converging weakly 
to a continuous distribution function F, then 

lim sup \F n (u) -F(u)\ =0. 

Pi-— >oo_ 00<ti<CJ0 

Observe that for each n, the exponential stage parameters 
of F n depend only on the choice of the random variables 
{X n f~ : 1 < k < v{n) + 1}. Regarding Q, if we fix the 
random variables X n ,k and let M i denote the MGF of G l , 
then this observation and the above results imply that for any 
given ei, £2 > and any finite set r ~ {ti, . ■ ■ , ife} C K+, 
3n = n(ei, 62, t) e N such that for each i G [N], G % and its 
n th GH approximation, F^, are t\— close in the sup norm 
and for each j G [fc], \M\tj) - M^(tj)\ < e 2 - Note that 
the exponential stage parameters are explicitly known and 
identical across the approximations F^, which now justifies 
our model of ([T}. 

The problem of estimating the members of Q can thus 
be reduced to determining the vector of mixing weights that 
characterizes each approximation and hence each distribu- 
tion. 

IV. Generalized Network Tomography 

With pi := {j 6 [N] : a^j = 1}, V« G [to], our strategy 
is as follows. For each i, we use the IID samples of Yi to 
estimate its MGF and subsequently build a polynomial sys- 
tem, say Hi(x) = 0. We call this the elementary polynomial 
system (EPS). We then show that for each i G [to] and each 
j G Pi, a close approximation of the vector Wj is present 
in the solution set of i?i(x) = 0, denoted V(Hi). To match 
the weight vectors to the corresponding random variables, 
we make use of the fact that A is 1— identifiable. 

A. Construction of Elementary Polynomial Systems 

Fix i G [to] and suppose that \pi\ = AT i; i.e., Yi is a sum 
of Ni random variables, which for convenience, we relabel 
as Xl, .. . ,Xv; in some order. Using ([TJ and ([3]), observe 
that the MGF of Y l is well defined Vi G M++ and satisfies 
the relation 



Ni (d+l 



^(t)=niE w i*\A t+ /, 



j=l U=l 



Ai. 



(4) 



On simplification, after substituting Wj^+i) = 1 



2) there exists < s < 1 such that lim n 1+s al Jk = l^k=i w jk, we get 



uniformly with respect to fc. 
Then given any continuous distribution function F, the 
following holds: 



n 



y^wjfcA fc (f) + A d+ i 



,fc=i 



(5) 



where A k (t) = {X k - X d+1 )t/(X k + t) and Hi(t) = 
M Yi {t)(\ d+1 +t) N *. 

For now, let us assume that we know My i (t) and hence 
Pi{t) exactly for every valid t. We will refer henceforth to 
this situation as the ideal case. Treating t as a parameter, we 
can then use (5]l to define a canonical polynomial 



/(x;t) 



where x = (xi , 



,fc=i 



, x^Vi ) with x.. 



(6) 



,Xjd)- As this 



is a multivariate map in d ■ N{ variables, we can choose an 
arbitrary set r = {tx, . . . , td-Ni} c ^++ consisting of dis- 
tinct numbers and define an intermediate square polynomial 
system 

F T (x) = (/ 1 (x),...,/ d . JV4 (x))=0, (7) 

where / fe (x) = f(x;t k ). 

Since (7]i depends on choice of r, analyzing it directly 
is difficult. But observe that i) the expansion of each /„ or 
equivalently (|6} results in rational coefficients in t of the 
form A L = A^ 1 (i)---A^(<)A < ^+ 1 , where L k G Z + and 
Sfeii Lk = Ni, and ii) the monomials that constitute each 
polynomial are identical. This suggests one may be able to 
get a simpler representation for ((TJ. We do so in the following 
three steps, where the first two focus on simplifying (6]). 
Step 1 -Gather terms with common coefficients: Let 



(Li 



, L d +i) G Z 



d+l 



d+1 

fc=i 



Ni 



For a vector b = . . . , 6jvJ G [d + l]^ 1 , let its type be 
denoted by 9(b) = (6>i(b), .*. . , 0«j+i(b)), where 6> fc (b) is 
the count of the element k in b. For every L G Ad+i,jv 4 ) 
additionally define the set 

S L = {b = (&!,.. .,b Ni )£[d+l} Ni : 9(b) =i} 

and the polynomial g(x;L) = £ beBi (Il^i, b i7 td+i x ibj)- 
Then collecting terms with common coefficients in (6]), the 
above notations help us rewrite it as 



/(x;i) = 



E 



<7(x;L)A L - M i(t). 



(8) 



Step2-Coefficient expansion and regrouping: Using an idea 
similar to partial fraction expansion for rational functions in 
t, the goal here is to decompose each coefficient into simpler 
terms. For each j, k G [d], let 



fa 



For each L G A d +i,N t : let V(L) := {k G [d] : L fc > 0}. 
Further, if X>(L) ^ then Vfc G P(L) and Vg G let 




A fc9 (L) 



:= {s=(si,. 
Vr G P(L) 



. , s d ) G Z° : s r = 

d 

U {fc}; 2J s„ = L fe 

n=l 



q} and 



7„(L):= II fc\ E II ("t-T 1 )^ 

rSZ>(L) [seA fc ,(L)reX)(L) 

where (^^-i^ 1 ) = (£ + -i)\s^\ ■ ^ e desired expansion is 
now given in the following lemma. 

Lemma 1: A L = £ fceP(L) E^i 7* g (L)A^)A^ if 
T>(L) 0. Further, this expansion is unique and holds Vt. 

Proo/- See [15]. ■ 
Applying this expansion to each coefficient in (8]i and 
regrouping shows that it can be rewritten as 



d Ni 



/(x; *) = e E h ^) A imz~i q - c (^)> 

k=l q=l 

where c(i) = /Ltj(t) — A^ and 

7ftg( L ) 



E 



A 



(9) 



(10) 



LGA d+ i iN . , Lfc>q /v d+l 

Step 3 -Eliminate dependence on r: The advantage of (9]) is 
that apart from c(t), the number of t-dependent coefficients 
equals d ■ Ni, which is exactly the number of unknowns in 
the polynomial /. Further, as shown below, they are linearly 
independent. 

Lemma 2: For k G [d-JVj], let b k = min{j G [d] : j-Ni > 
k}. Then the matrix T T , where for j, k G [d ■ Ni] 



(Tr)j* 



A 



(11) 



is non-singular. 

Proof: See [15]. ■ 
Now observe that if we let c r = (e(ti), . . . , c^-at,)), 
£ fe (x) ee (/i fe i(x),...,^ feiVi (x)) and £(x) = (£i(x), 
£<j(x)), then (|7]i can be equivalently expressed as 



T T £(x) — c T 

Premultiplying (12) by (T r ) _1 
Lemma |2j we obtain 



= 0. (12) 
which now exists by 



0. 



(13) 



A crucial point to note now is that w ee (wi, . . . , Wjvj is 
an obvious root of (6]) and hence of ( fT3| . This immediately 
implies (T T ) 1 c r = £(w) and consequently fL3] l can 
rewritten as 

Hi(x) = f (x) - f (w) = 0. (14) 

Note that ( fT"4"| ) is devoid of any reference to the set r and 
can be arrived at using any valid r. Furthermore, because 
of the equivalence between (7]i and ( fl4) , any conclusion that 
we can draw for ( fL4] > must hold automatically for the system 
of Q. Because of these reasons, we will henceforth refer to 
( fT4l > as the EPS. 

Example 1: Let Ni = d = 2. Also, let Ai = 5, A2 = 3 
and A3 = 1. Then the map £ described above is given by 

I + a; 2 i + 5(a;iia;22 + X12X21) \_ 

X12 + X 2 2 - &{x\iX 2 2 + X 12 X 2 l) 

V ^12^22 / 



f(x) 



(15) 



We next describe some special features of the EPS. Let 
x ff := (x CT n), . . . ,x a (js[A), a £ denote a permutation 
of the vectors xi, . . . , xjv ; and 7r x := {x CT : a £ S^}. 

Lemma 3: -ffj(x) = iJ,*(x CT ), Vcr £ Sn ( . That is, the map 
Hi is symmetric. 

Proof: Observe that iJ 4 (x) = {T T )- 1 F T (x) and F T , as 
defined in |7]), is symmetric. The result thus follows. ■ 

Next recall that if the complement of a solution set of a 
polynomial system is non-empty then it must be open dense 
in the Euclidean topology. This fact and the above result help 
us now to show that the EPS is almost always well behaved. 

Lemma 4: There exists an open dense set 1Z of ~R d ' Ni 
such that if w £ 1Z then the solution set of the EPS satisfies 
the following properties. 

1) w e V{Hi). 

2) If x* £ V(Hi), then tt x . cV(Hi). 

3) \V(Hi)\ = k x Nil, where k € N is independent of 
weK. Further, each solution is non-singular. 

Proof: See [15]. ■ 
We henceforth assume that weR. Property [2] above then 
suggests that it suffices to work with 

Mi = {a e C d : 3x* € V{H t ) with x* = a}. (16) 

Observe that Wj := {wi, ...,wjv i }cA / Ji.A point to note 
here is that Zj := A / Ji\Wi is not empty in general. 

Our next objective is to develop the above theory for the 
case where for each i € [m], instead of the exact value 
of My^t), we have access only to the IID realizations 
{Yu}i>i of the random variable Yi. That is, for each k £ 
[N], we have to use the sample average My\ = 
(y^f—i exp(— t k Yu)/Lj for an appropriately chosen large 

L, c(t k ;L) = ~ A%(< fc )(A d+ i + i fe ) Wl and c T . L = 

(c(ti; L), . . . , c{td-Ni\ L)) as substitutes for each My^ik), 
each c(tfe) and c T respectively. But even then note that the 
noisy or the perturbed version of the EPS 

^(x) = £(x)-(T T )- 1 (c T , i ) = 0. (17) 

is always well defined. More importantly, the perturbation is 
only in its constant term. As in Lemma [3] it then follows 
that the map Hi is symmetric. 

Next observe that since 1Z is open (see Lemma [4]), there 
exists a small enough 5i > such that B(w; Si) CTZ. Using 
the regularity of solutions of the EPS (see Property 3 of 
Lemma the inverse function theorem then gives us the 
following result. 

Lemma 5: Let 5 £ (0, 5A be such that for any two distinct 
solutions in V(Hi), say x* and y*, B(x*; S)f]B{y*; S) = 0. 
Then there exists an e(S) > such that if u £ M. d ' Ni and 
||u - £(w)|| < e(S) then the solution set V(Hi) of the 
perturbed EPS £ (x) — u = satisfies the following: 

1) All roots in V(Hi) are regular points of the map £ . 

2) For each x* £ V(Hi), there is a unique z* £ V(Hi) 
such that ||x* — z*|| < S. 

3) For each z* £ V(Hi), there is a unique x* £ V(Hi) 
such that ||x* — z*|| < S. 

Proof: See [15]. ■ 



The above result, in simple words, states that if we can get 
hold of a close enough approximation of £ (w), say u, then 
solving the perturbed EPS £(x) — u = is almost as good 
as solving the EPS of ( fB) , We now show how to acquire 
such an approximation of £(w). 

Lemma 6: Let S and e(<5) be as described in Lemma [5] 
Then for tolerable failure rate n > and the chosen set r, 
3£ t ,i5,k G N such that if L > L Ty s. K then with probability 
greater than 1 — k, we have ||(T T ) _1 c T i — £(w)|| < e(S). 

Proof: Note that exp(-t k Yu) £ [0, 1] Vi, I and fc. The 
Hoeffding inequality (see [16]) then shows that for any e > 
0, Pr{\M Yt {t k ; L) - M Yi \tk)\ > e} < exp(-2e 2 L). Since 
f (w) = (T T ) _1 c T , the result is now immediate. ■ 

Let us now fix a L > L T< g tK and let Ai(n) denote the event 
HCZV)- 1 ^ -£(w)|| < e{8). Clearly, Pr{^?(«)} < «. 
Observe that when ^(k) is a success the solution set of 
\Y1\ , with L as chosen above, satisfies all properties given 
in Lemma [5] Because of the symmetry of the map Hi, as in 
( fT6l >, it again suffices to work with 

A = {a £ C d : 3z* £ V{H t ) with z* = a}. (18) 

We are now done discussing the EPS for an arbitrary i £ 
\m]. In summary, we have managed to obtain a set M.i in 
which a close approximation of the weight vector of random 
variables Xj that add up to give Yi is surely present. The next 
subsection takes a unified view of the solution sets {Aii : 
i £ [m]} to match the weight vectors to the corresponding 
random variables. But before that, we redefine VVj = {w^ : 
j £ pi}. Accordingly, Mi,Mi,V(Hi) and V(-ffj) are also 
redefined using notations of Section |H| 

B. Parameter Matching using 1-identifiability 

We begin by giving a physical interpretation for the 
1 identifiability condition of the matrix A. For this, let 
Gj ■= {i € [m] : j £ Pi] and B 3 := [m]\Gj- 

Lemma 7: For a 1— identifiable matrix A, each index j £ 
[N] satisfies 

{.?}= n n pi=-- v r 

Proof: By definition, j £ Vj . For converse, if k £ T>j , 
k ^ j, then columns j and k of A are identical; contradicting 
its 1— identifiability condition. Thus {j} =T>j. ■ 

An immediate result is the following. 

Corollary 1: Suppose A is a 1— identifiable matrix. If the 
map u : [N] — > X, where X is an arbitrary set, is bijective 
and Vi £ [to], Vi := {u(j) : j £ pi}, then for each j £ [N] 

Mm = n v 9 n n < 

geGj beBj 
By reframing this, we get the following result. 

Theorem 2: Suppose A is a 1— identifiable matrix. If the 

weight vectors wi,...,wjy are pairwise distinct then the 

rule 

i> ■■ j ; -+ n w 3 n n (i9) 

geGj beBj 

satisfies ijj(j) = Wj, 



This result is where the complete potential of the 1— 
identifiability condition of A is being truly taken advantage 
of. What this states is that if we had access to the collection 
of solution sets {Wi : i £ [to]} then using ip we would 
have been able to uniquely match the weight vectors to the 
random variables. But note that, at present, we have access 
only to the collection {Mi : i £ [to]} in the ideal case and 
{Mi : i G [to]} in the perturbed case. Keeping this in mind, 
our goal now is to show that if Vii,i2 G [to], i\ ^ ii, 

l tl n M l2 = 0, (20) 

a condition that always held in simulation experiments, then 
the rules (with minor modifications): 

f]M g n f]M c b (21) 

for the ideal case, and 

i> : j p| M g n f)M c b (22) 

in the perturbed case, recover the correct weight vector 
associated to each random variable Xj. 

We first discuss the ideal case. Let S := {j G [N] : \Gj\ > 
2}. Because of |20} and Theorem [2] note that 

1) If j G S, then ip(j) = {wj. 

2) If j G S c , j G ft. then - {w,} Ul,.. 

That is, pi) works perfectly fine when j G S. The problem 
arises when j G 6> c as does not give as output a unique 
vector. To correct this, fix j G 6> c . If j G pj» then let v suh = 
(w/j : fc G Pi*\{i})- Because of 1— identifiability, note that 
if k G Pi*\{j} then fc G S. From (|6]l and Q, it is also 
clear that (v sub ,a) G V(Hi) if and only if a = w,. This 
suggests that we need to match parameters in two stages. In 
stage 1, we use pT) to assign weight vectors to all those 
random variables Xj such that j G S. In stage 2, for each 
j G <S C , we identify i* G [to] such that j G p,:*. We then 
construct \- sub . We then assign to j that unique a for which 
(v sub ,a) £ V(ifj*). Note that we are ignoring the trivial 
case where |pj* | = 1. It is now clear that by using ( pTj i with 
modifications as described above, at least for the ideal case, 
we can uniquely recover back for each random variable Xj 
its corresponding weight vector Wj. 

We next handle the case of noisy measurements. Let U := 
Uie[ m ]A^i and U := U ie \ m iMi. Observe that using p2| ) 
directly will, with probability one, satisfy ip{i) = f° r eacn 
j G [N\. This happens because we are distinguishing across 
the solution sets the estimates obtained for a particular weight 
vector. Hence as a first step we need to define a relation 
~ onW that associates these related elements. Recall from 
Lemmas |]and|6] that the set M4 can be constructed for 
any small enough choice of 6, n > 0. With choice of S that 
satisfies 

< 45 < min ||a-/3||, (23) 

a,/3eU 

let us consider the event A := n,- e r m i.Aj(fi;/m). Using a 
simple union bound, it follows that Pr{„4 c } < n. Now 



suppose that the event A is a success. Then by ( |23] l and 
Lemma |5j the following observations follow trivially. 

1) For each i G [to] and each a G A4i, there exists at 
least one a G Mi such that ||d — a\\ < S. 

2) For each i G [to] and each a G Mi, there exists 
precisely one a G Mi such that ||d — a|| < 6. 

3) Suppose for distinct elements a, ft G U, we have 
a,j3 £U such that ||d - a|| < 6 and \\p - /3\\ < S. 
Then \\a - f3\\ > 25. 

From these, it is clear that the relation ~onW should be 

d~ $ iff ||d-/3|| < 2^. (24) 

It is also easy to see that, whenever the event A is a success, 
~ defines an equivalence relation on U. For each i G [to], 
the obvious idea then is to replace each element of Mi 
and its corresponding d— dimensional component in V(Hi) 
with its equivalence class. It now follows that |22|, with 
modifications as was done for the ideal case, will satisfy 

ip(j) = {aeti:\\a-Wj\\<5}. (25) 

This is obviously the best we could have done starting from 
the set {Mi : i G [m]}. 

We end this section by summarizing our complete method 
in an algorithmic fashion. For each i G [to], let {i^/};>i be 
the IID samples of Y^. 

Algorithm 1: Distribution tomography 

Phase 1: Construct & Solve the EPS. 
For each i G [to], 

1) Choose an arbitrary r = {ti, . . . ,id.jv 4 } °f distinct 
positive real numbers. 

2) For a large enough L G N and each tj G t, 

set Myj(*j) = fe£=i expt-tj-Yii)) /£, = 

(Ad+i + tj^My^tj) and c(t,) = £ife) - A^ r 
Using this, construct c r = (c(ti), . . . , c(td-Ni))- 

3) Solve f (x) — T^c-r = using any standard solver for 
polynomial systems. 

4) Build Mi = {a G C d : 3x* G U(i^) with xj = a}. 
Phase 2: Parameter Matching 

1) Set U :— Uj£[m] Choose 5 > small enough and 
define the relation ~ onW, where a ~ (3 if and only 
if ||d — /3 1 1 2 < 25. If ~ is not an equivalence relation 
then choose a smaller 5 and repeat. 

2) Construct the quotient set U\ ~ . Replace all elements 
of each Mi, V{Hi) with their equivalence class. 

3) For each j G 5, set 

^(i) = (n 9e ^-M 9 )n(a eB ^g)- 

4) For each j £ S c , 

a) Set i* = i E [to] such that j <G p;* . 

b) Construct v sub = (ip(k) :k£p,,,k^ j). 

c) Set = a such that (V M \d) G 

V. Universality 

The crucial step in the method described above was to 
come up with, for each i £ [to] , a well behaved polynomial 
system, i.e., one that satisfies the properties of Lemma [4] 



based solely on the samples of Y^. Once that was done, 
the ability to match parameters to the component random 
variables was only a consequence of the 1— identifiability 
condition of the matrix A. This suggests that it may be 
possible to develop similar schemes even in settings different 
to the ones assumed in Section [TT] In fact, functions other 
than the MGF could also serve as blueprints for constructing 
the polynomial system. We discuss in brief few of these 
ideas in this section. Note that we are making a preference 
for polynomial systems for the sole reason that there exist 
computationally efficient algorithms, see for example [17], 
[18], [19], [20], to determine all its roots. 

Consider the case where Vj € [N], the distribution of Xj 
is a finite mixture model given by 

dj+i 

F j( u ) = X! w jk(/>jk(u), (26) 

k=l 

where dj £ N, Wji, . . . , Wjr^+i) denote mixing weights, 
i.e., Wjk > and J^k^ 1 w jk = lj and {4>jk(u)} are some 
basis functions, say Gaussian, uniform, etc. The MGF is thus 
given by 

d 3+ 1 rOO 

M Xj (t) = ^2 w jk / exp(-ut)d^ jk (u). (27) 
k=i Ju =° 

Note now that if the basis functions {4>jk\ are completely 
known then the MGF of each Yi will again be a polynomial 
in the mixing weights, {wjk}, similar in spirit to the relation 
of Q. As a result, the complete recipe of Section |IV] can 
again be attempted to estimate the weight vectors of the 
random variables Xj using only the IID samples of each 
Y t . 

In relation to ([TJ or ( |2"6| ), observe next that Vn £ N, the 
n th moment of each Xj is given by 

E ( X ?) = Yl w i k / und <t>jk(u). (28) 
k=i Ju=0 

Hence, the n th moment of Yi is again a polynomial in the 
unknown weights. This suggests that, instead of the MGF, 
one could use the estimates of the moments of Yi to come 
up with an alternative polynomial system and consequently 
solve for the distribution of each Xj . 

Moving away from the models of ([T} and ( po} , suppose 
that for each j £ [N], Xj ~ exp(rrij). Assume that each 
mean rrij < oc and that mj 1 ^ nij 2 when ji ^ j 2 . We 
claim that the basic idea of our method can be used here to 
estimate mi, . . . , and hence the complete distribution of 
each Xj using only the samples of Yi. As the steps are quite 
similar when either i) we know My i if) for each i £ [to] and 
every valid t and ii) we have access only to the IID samples 
{Yu}i>i for each i £ [m], we take up only the first case. 

Fix i £ [to] and let pi :— {j £ [N] : a,ij — 1}. To simplify 
notations, let us relabel the random variables {Xj : j £ pi} 
that add up to give Yi as X\,... ,X^ i , where Ni — \pi\. 



Observe that the MGF of Yi, after inversion, satisfies 

Y[(l + tm j ) = l/M Yi (t). (29) 

Using ([29]), we can then define the canonical polynomial 

/(x;t) := Y[(l + t Xj )-c(t), (30) 
j'=i 

where x = (zi, . . . ,XNi) and c(t) — l/My^t). Now choose 
an arbitrary set t = {tx, . . . , t^} C R++ consisting of 
distinct numbers and define 

F T (x) = (/ 1 (x),...,/ JVi (x))=0, (31) 

where /fc(x) = f(x;tk). We emphasize that this sys- 
tem is square of size Ni, depends on the choice of 
subset t and each polynomial ff. is symmetric with re- 
spect to the variables Xi 1 ...,xjf i . In fact, if we let 
c r = (c(ti),...,c(ijv 4 )) and £(x) = (ex(x), . . . , e Ni (x)), 
where e fe (x) = V P ; .... . v x h . ..x jk denotes the 
k th elementary symmetric polynomial in the JVj variables 
xi t . . . ,XNt, we can rewrite ( |3T| as 

T T S(x) - (c T - 1) = 0. (32) 

Here T T denotes a Vandermonde matrix of order N in 
ti,...,tif t with (T T )jk — tj. Its determinant, given by 

det(T T ) = fn"=i*i) llj>i( < j - is clearly non-zero. 
Premultiplying ( [32] > by T^r 1 , we obtain 

f(x)-T- 1 (c T -l)=0. (33) 

Observe now that the vector m = (mi , . . . , to^ ) is a natural 
root of (|3TJ and hence of ((33). Hence T 7 T 1 (c T — 1) = £(m). 
The EPS for this case can thus be written as 

Hi(x) = S(x)- £(m) = 0. (34) 

We next discuss the properties of this EPS, or more 
specifically, its solution set. For this, let V(Hi) := {x £ 
C N * : Hi(x) = 0}. 

Lemma 8: V(Hi) = ir m := {er(m) : a £ S 1 ^}. 

Proof: This follows directly from ( |34|> . ■ 
Lemma 9: For every x* £ V(Hi), det(£(x*)) ^ 0. 
Proof: This follows from Lemma <|8j and the fact that 

det(£(x)) = Ui<j<k<Ni( x 3 ~ x k)- ■ 
Because of Lemma^H] it suffices to work with only the 
first components of the roots. Hence we define 

M % := {a* £ C : 3x* £ V(H Z ) with x* = a}, (35) 

which in this case is equivalent to the set {mi, . . . ,TOjVi}- 
Reverting back to global notations, note that 

Mi = {rrij : j £ Pi }. (36) 

Since i was arbitrary, we can repeat the above procedure 
to obtain the collection of solution sets {M.i : i £ [m]}. 



Arguing as in Theorem |2j it is now follows that if A is 
1— identifiable then the rule 



m = n 



(37) 



where Qj 



= {i G [rn] : j G Pi} and Bj = [m]\Gj, satisfies 
the relation tp(j) — nij. That is, having obtained the sets 
{Mi : i G [to]}, one can use ip to match the parameters to 
the corresponding random variables. 

This clearly demonstrates that even if a transformation of 
the MGF is a polynomial in the parameters to be estimated, 
our method may be applicable. 

VI. Experimental Results 





(3 ) (4) (3> 

(a) Tree topology (b) General topology 

Fig. 1. Network topologies in simulation experiments. 

In order to verify performance, we conducted matlab based 



simulations using a network with i) tree topology, Fig. 1(a) 



1(a) 



and ii) general topology, Fig. 1(b) 

A. Network with Tree Topology 

We work here with the network of Figure 

Experiment 1: Node 1 is the source node, while nodes 3 
and 4 act as sink. The packet delay across link j. denoted Xj , 
is a hyperexponential random variable — a special case of the 
([TJ. The count of exponential stages in each link distribution 
equals three. That is, d = 2. The corresponding exponential 
stage parameters Ai,Aa and A3 are taken to be 5,3 and 1 
respectively. For each link, the weight associated with each 
exponential stage is given in first three columns of Table [I] 

TABLE I 

Actual and Estimated weights for Expt.Q] 



Link 


Wjl 


Wj 2 


w j3 


Wjl 


Wj 2 


w j3 


1 


0.17 


0.80 


0.03 


0.15 


0.82 


0.02 


2 


0.13 


0.47 


0.40 


0.15 


0.46 


0.39 


3 


0.80 


0.15 


0.05 


0.79 


0.15 


0.06 



Let pi be the path connecting the nodes 1,2 and 3. 
Similarly, let P2 be the path connecting the nodes 1,2 and 
4. Let Yi and Y 2 denote respectively the end-to-end delay 
across each of these paths. If we let Y = (Y\,Y 2 ) and 
X = (Xi,X 2 ,X 3 ) then it follows that Y = AX, where 



A = 



1 1 
1 1 



We are now in the framework of Section [TT1 

We first focus on path Observe that its EPS is given 
by the map of ( |15) , We collect now a million samples 
of its end-to-end delay. Choosing an arbitrary set r = 
{1.9857,2.3782,0.3581,8.8619}, we run the first phase of 
Algorithm [7] to obtain jCi 1 . This set along with its ideal 
counterpart is given in Table [II] 

TABLE II 

Solution set of the EPS for path pi in Expt.[JJ 



sol-ID 


Mi 


Mi 


1 


(0.1300, 0.4700) 


(0.1542, 0.4558) 


2 


(0.1700, 0.8000) 


(0.1292, 0.8356) 


3 


(3.8304, -2.8410) 


(3.8525, -2.8646) 


4 


(0.1933, 0.7768) 


(0.2260, 0.7394) 


5 


(0.1143, 0.4840) 


(0.0882, 0.5152) 


6 


(0.0058, -0.1323) 


(0.0052, -0.1330) 



Similarly, by probing the path p 2 with another million 
samples and with r = {0.0842,0.0870,0.0305,0.0344},^ 
determine M. 2 . The sets A4 2 and M. 2 are given in Table 



III 



TABLE III 

Solution set of the EPS for path p 2 in Expt.[JJ 



sol-ID 


M2 


M2 


1 


(0.8000, 0.1500) 


(0.7933, 0.1459) 


2 


(0.1660, 0.7775) 


(0.1720, 0.8095) 


3 


(5.5623, -4.5638) 


(5.5573, -4.5584) 


4 


(0.1700, 0.8000) 


(0.1645, 0.7669) 


5 


(0.8191, 0.1543) 


(0.8296, 0.1540) 


6 


(0.0245, -0.0263) 


(0.0246, -0.0259) 



To match the weight vectors to corresponding links, firstly 
observe that the minimum distance between A4i and M. 2 is 
0.0502. Based on this, we choose 5 = 0.03 and run the 
second phase of Algorithm [T] The obtained results, after 
rounding to two significant digits, are given in the second half 
of Table [I] Note that the weights obtained for the first link 
are determined by taking a simple average of the solutions 
obtained from the two different paths. The norm of the error 
vector equals 0.0443. 

Experiment 2: Keeping other things unchanged as in the 
setup of experiment [T] we consider here four exponential 
stages in the distribution of each Xj. The exponential stage 
parameters Ai, A 2 , A3 and A 4 equal 5, 4, 0.005 and 1 respec- 



tively. The corresponding weights are given in Table IV But 
observe that the weights of the third stage is negligible for 
all three links. We can thus ignore its presence. The results 
obtained after running Algorithm [T] are given in the second 
half of Table IV The norm of the error vector equals 0.1843. 



TABLE IV 

Actual and Estimated weights for Expt.[2] 



Link 


Wjl 


Wj 2 


Wj 3 


Wj4 


Wjl 


Wj 2 


Wj 3 


Wji 


1 


0.71 


0.20 


0.0010 


0.08 


0.77 


0.20 





0.03 


2 


0.41 


0.17 


0.0015 


0.41 


0.38 


0.18 





0.44 


3 


0.15 


0.80 


0.0002 


0.04 


0.12 


0.70 





0.18 



B. Network with General Topology 



We deal here with the network of Figure 1(b) 



Experiment 3: Nodes 1 and 2 act as source while nodes 
3 and 4 act as sink. We consider here three paths. Path p% 
connects the nodes 1,2 and 3, path p2 connects the nodes 
1, 2 and 4, while path p$ connects the nodes 2, 3 and 4. Let 
X = (Xi, X2, X3, X4), where Xj is the packet delay across 
link j. Also, let Y = (Yi, Y 2 ,Y 3 ), where Y^ denotes the end- 
to-end delay across path pi. They are related by Y = AX, 
where 

/ 1 1 
A= 1 1 
\ 1 1 

As in Experiment [T] the random variable Xj is hyperexpo- 
nentially distributed with values of d, Ai, A2, A3 kept exactly 
the same. By choosing again a million probe packets for each 
path, we run Algorithm [T] The actual and estimated weights 
are shown in Table [V] 

TABLE V 

Actual and Estimated weights for Expt.[3] 



Link 


Wjl 


Wj2 


Wj3 


Wjl 


Wj 2 


Wj 3 


1 


0.34 


0.26 


0.40 


0.34 


0.24 


0.42 


2 


0.46 


0.49 


0.05 


0.45 


0.50 


0.05 


3 


0.12 


0.65 


0.23 


0.11 


0.68 


0.21 


3 


0.71 


0.19 


0.10 


0.69 


0.18 


0.13 



VII. Discussion 

This paper took advantage of the properties of polynomial 
systems to develop a novel algorithm for the GNT problem. 
For any arbitrary 1— identifiable matrix A, it demonstrated 
successfully how to accurately estimate the distribution of the 
random vector X. with mutually independent components, 
using only IID samples of the components of the random 
vector Y = AX. Translating to network terminology, this 
means that one can now address the tomography problem 
even for networks with arbitrary topologies using only pure 
unicast probe packet measurements. The fact that we need 
only the IID samples of the components of Y shows that 
the processes to acquire these samples across different paths 
can be completely asynchronized. Another nice feature of 
this approach is that it can estimate the unknown link level 
performance parameters even when no prior information is 
available about the same. 
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